The search functionality is under construction.

Author Search Result

[Author] Tsutomu SASAO(42hit)

21-40hit(42hit)

  • Bi-Partition of Shared Binary Decision Diagrams

    Munehiro MATSUURA  Tsutomu SASAO  Jon T. BUTLER  Yukihiro IGUCHI  

     
    PAPER-Logic Synthesis

      Vol:
    E85-A No:12
      Page(s):
    2693-2700

    A shared binary decision diagram (SBDD) represents a multiple-output function, where nodes are shared among BDDs representing the various outputs. A partitioned SBDD consists of two or more SBDDs that share nodes. The separate SBDDs are optimized independently, often resulting in a reduction in the number of nodes over a single SBDD. We show a method for partitioning a single SBDD into two parts that reduces the node count. Among the benchmark functions tested, a node reduction of up to 23% is realized.

  • Optimization of Pseudo-Kronecker Expressions Using Multiple-Place Decision Diagrams

    Tsutomu SASAO  

     
    PAPER-Logic Design

      Vol:
    E76-D No:5
      Page(s):
    562-570

    This paper presents an optimization method for pseudo-Kronecker expressions of p-valued input two-valued output functions by using multi-place decision diagrams for p2 and p4. A conventional method using extended truth tables requires memory of O (3n) to simplify an n-variable expression, and is only practical for functions of up to n14 variables when p2. The method presented here utilizes multi-place decision diagrams, and can optimize considerably larger problems. Experimental results for up to n39 variables are shown.

  • Classification Functions for Handwritten Digit Recognition

    Tsutomu SASAO  Yuto HORIKAWA  Yukihiro IGUCHI  

     
    PAPER-Logic Design

      Pubricized:
    2021/04/01
      Vol:
    E104-D No:8
      Page(s):
    1076-1082

    A classification function maps a set of vectors into several classes. A machine learning problem is treated as a design problem for partially defined classification functions. To realize classification functions for MNIST hand written digits, three different architectures are considered: Single-unit realization, 45-unit realization, and 45-unit ×r realization. The 45-unit realization consists of 45 ternary classifiers, 10 counters, and a max selector. Test accuracy of these architectures are compared using MNIST data set.

  • A Packet Classifier Based on Prefetching EVMDD (k) Machines

    Hiroki NAKAHARA  Tsutomu SASAO  Munehiro MATSUURA  

     
    PAPER-Logic Design

      Vol:
    E97-D No:9
      Page(s):
    2243-2252

    A Decision Diagram Machine (DDM) is a special-purpose processor that has special instructions to evaluate a decision diagram. Since the DDM uses only a limited number of instructions, it is faster than the general-purpose Micro Processor Unit (MPU). Also, the architecture for the DDM is much simpler than that for an MPU. This paper presents a packet classifier using a parallel EVMDD (k) machine. To reduce computation time and code size, first, a set of rules for a packet classifier is partitioned into groups. Then, the parallel EVMDD (k) machine evaluates them. To further speed-up for the standard EVMDD (k) machine, we propose the prefetching EVMDD (k) machine which reads both the index and the jump address at the same time. The prefetching EVMDD (k) machine is 2.4 times faster than the standard one using the same memory size. We implemented a parallel prefetching EVMDD (k) machine consisting of 30 machines on an FPGA, and compared it with the Intel's Core i5 microprocessor running at 1.7GHz. Our parallel machine is 15.1-77.5 times faster than the Core i5, and it requires only 8.1-58.5 percents of the memory for the Core i5.

  • On the Numbers of Products in Prefix SOPs for Interval Functions

    Infall SYAFALNI  Tsutomu SASAO  

     
    PAPER-Computer System

      Vol:
    E96-D No:5
      Page(s):
    1086-1094

    First, this paper derives the prefix sum-of-products expression (PreSOP) and the number of products in a PreSOP for an interval function. Second, it derives Ψ(n,τp), the number of n-variable interval functions that can be represented with τp products. Finally, it shows that more than 99.9% of the n-variable interval functions can be represented with ⌈ n - 1 ⌉ products, when n is sufficiently large. These results are useful for a fast PreSOP generator and for estimating the size of ternary content addressable memories (TCAMs) for packet classification.

  • A Virus Scanning Engine Using an MPU and an IGU Based on Row-Shift Decomposition

    Hiroki NAKAHARA  Tsutomu SASAO  Munehiro MATSUURA  

     
    PAPER-Application

      Vol:
    E96-D No:8
      Page(s):
    1667-1675

    This paper shows a virus scanning engine using two-stage matching. In the first stage, a binary CAM emulator quickly detects a part of the virus pattern, while in the second stage, the MPU detects the full length of the virus pattern. The binary CAM emulator is realized by an index generation unit (IGU) based on row-shift decomposition. The proposed system uses two off-chip SRAMs and a small FPGA. Thus, the cost and the power consumption are lower than the TCAM-based system. The system loaded 1,290,617 ClamAV virus patterns. As for the area and throughput, this system outperforms existing two-stage matching systems using FPGAs.

  • A New Equivalence Relation of Logic Functions and Its Application in the Design of AND-OR-EXOR Networks

    Debatosh DEBNATH  Tsutomu SASAO  

     
    PAPER

      Vol:
    E90-A No:5
      Page(s):
    932-940

    This paper presents a design method for AND-OR-EXOR three-level networks, where a single two-input exclusive-OR (EXOR) gate is used. The network realizes an EXOR of two sum-of-products expressions (EX-SOPs). The problem is to minimize the total number of products in the two sum-of-products expressions (SOPs). We introduce the notion of µ-equivalence of logic functions to develop exact minimization algorithms for EX-SOPs with up to five variables. We minimized all the NP-representative functions for up to five variables and showed that five-variable functions require 9 or fewer products in minimum EX-SOPs. For n-variable functions, minimum EX-SOPs require at most 9·2n-5 (n ≤ 6) products. This upper bound is smaller than 2n-1, which is the upper bound for SOPs. We also found that, for five-variable functions, on the average, minimum EX-SOPs require about 40% fewer literals than minimum SOPs.

  • Fault Diagnosis for RAMs Using Walsh Spectrum

    Atsumu ISENO  Yukihiro IGUCHI  Tsutomu SASAO  

     
    PAPER-Memory Testing

      Vol:
    E87-D No:3
      Page(s):
    592-600

    In this paper, we show a method to locate a single stuck-at fault of a random access memory (RAM). From the fail-bitmaps of the RAM, we obtain their Walsh spectrum. For a single stuck-at fault, we show that the fault can be identified and located by using only the 0-th and 1-st coefficients of the spectrum. We also show a circuit to compute these coefficients. The computation time is O(2n), where n is the number of bits in the address of the RAM. The computation time is much shorter than one that uses a logic minimization method.

  • Fast Boolean Matching under Permutation by Efficient Computation of Canonical Form

    Debatosh DEBNATH  Tsutomu SASAO  

     
    PAPER-Logic Synthesis

      Vol:
    E87-A No:12
      Page(s):
    3134-3140

    Checking the equivalence of two Boolean functions under permutation of the variables is an important problem in the synthesis of multiplexer-based field-programmable gate arrays (FPGAs), and the problem is known as Boolean matching. This paper presents an efficient breadth-first search technique for computing a canonical form--namely P-representative--of Boolean functions under permutation of the variables. Two functions match if they have the same P-representative. On an ordinary workstation, on the average, the method requires several microseconds to check the Boolean matching of functions with up to eight variables against a library with tens of thousands of cells.

  • Design Methods of Radix Converters Using Arithmetic Decompositions

    Yukihiro IGUCHI  Tsutomu SASAO  Munehiro MATSUURA  

     
    PAPER-Computer Components

      Vol:
    E90-D No:6
      Page(s):
    905-914

    In arithmetic circuits for digital signal processing, radixes other than two are often used to make circuits faster. In such cases, radix converters are necessary. However, in general, radix converters tend to be complex. This paper considers design methods for p-nary to binary converters. First, it considers Look-Up Table (LUT) cascade realizations. Then, it introduces a new design technique called arithmetic decomposition by using LUTs and adders. Finally, it compares the amount of hardware and performance of radix converters implemented by FPGAs. 12-digit ternary to binary converters on Cyclone II FPGAs designed by the proposed method are faster than ones by conventional methods.

  • A Method to Find Linear Decompositions for Incompletely Specified Index Generation Functions Using Difference Matrix

    Tsutomu SASAO  Yuta URANO  Yukihiro IGUCHI  

     
    PAPER-Logic Synthesis, Test and Verification

      Vol:
    E97-A No:12
      Page(s):
    2427-2433

    This paper shows a method to find a linear transformation that reduces the number of variables to represent a given incompletely specified index generation function. It first generates the difference matrix, and then finds a minimal set of variables using a covering table. Linear transformations are used to modify the covering table to produce a smaller solution. Reduction of the difference matrix is also considered.

  • Efficient Computation of Canonical Form under Variable Permutation and Negation for Boolean Matching in Large Libraries

    Debatosh DEBNATH  Tsutomu SASAO  

     
    PAPER-Logic Synthesis

      Vol:
    E89-A No:12
      Page(s):
    3443-3450

    This paper presents an efficient technique for solving a Boolean matching problem in cell-library binding, where the number of cells in the library is large. As a basis of the Boolean matching, we use the notion NP-representative (NPR): two functions have the same NPR if one can be obtained from the other by a permutation and/or complementation(s) of the variables. By using a table look-up and a tree-based breadth-first search strategy, our method quickly computes the NPR for a given function. Boolean matching of the given function against the whole library is determined by checking the presence of its NPR in a hash table, which stores NPRs for all the library functions and their complements. The effectiveness of our method is demonstrated through experimental results, which show that it is more than two orders of magnitude faster than the Hinsberger-Kolla's algorithm.

  • A Memory-Based IPv6 Lookup Architecture Using Parallel Index Generation Units

    Hiroki NAKAHARA  Tsutomu SASAO  Munehiro MATSUURA  Hisashi IWAMOTO  Yasuhiro TERAO  

     
    PAPER-Architecture

      Pubricized:
    2014/11/19
      Vol:
    E98-D No:2
      Page(s):
    262-271

    In the era of IPv6, since the number of IPv6 addresses rapidly increases and the required speed is more than Giga lookups per second (GLPS), an area-efficient and high-speed IP lookup architecture is desired. This paper shows a parallel index generation unit (IGU) for memory-based IPv6 lookup architecture. To reduce the size of memory in the IGU, we use a linear transformation and a row-shift decomposition. A single-memory realization requires O(2l log k) memory size, where l denotes the length of prefix, while the realization using IGU requires O(kl) memory size, where k denotes the number of prefixes. In IPv6 prefix lookup, since l is at most 64 and k is about 340 K, the IGU drastically reduces the memory size. Also, to reduce the cost, we realize the parallel IGU by using both on-chip and off-chip memories. We show a design algorithm for the parallel IGU to store given off-chip and on-chip memories. The parallel IGU has a simple architecture and performs lookup by using complete pipelines those insert the pipeline registers in all the paths. We loaded more than 340 K IPv6 pseudo prefixes on the Xilinx Virtex 6 FPGA with off-chip DDRII+ Static RAMs (SRAMs). Its lookup speed is 1.100 giga lookups per second (GLPS) which is sufficient for the required speed for a next generation 400 Gbps link throughput. As for the normalized area and lookup speed, our implementation outperforms existing FPGA implementations.

  • FOREWORD

    Tsutomu SASAO  

     
    FOREWORD

      Vol:
    E82-D No:5
      Page(s):
    909-909
  • FOREWORD

    Tsutomu SASAO  

     
    FOREWORD

      Vol:
    E80-D No:10
      Page(s):
    973-973
  • A Fast Updatable Implementation of Index Generation Functions Using Multiple IGUs

    Tsutomu SASAO  

     
    PAPER-Logic Design

      Pubricized:
    2017/05/19
      Vol:
    E100-D No:8
      Page(s):
    1574-1582

    This paper presents a method to realize index generation functions using multiple Index Generation Units (IGUs). The architecture implements index generation functions more efficiently than a single IGU when the number of registered vectors is very large. This paper proves that independent linear transformations are necessary in IGUs for efficient realization. Experimental results confirm this statement. Finally, it shows a fast update method to IGUs.

  • Minimization of AND-OR-EXOR Three-Level Networks with AND Gate Sharing

    Debatosh DEBNATH  Tsutomu SASAO  

     
    PAPER-Logic Design

      Vol:
    E80-D No:10
      Page(s):
    1001-1008

    This paper presents an exact minimization algorithm for AND-OR-EXOR three-level networks, where a single two-input exclusive-OR (EXOR) gate is used. The network realizes an EXOR of two sum-of-products expressions (EX-SOP), where the two sum-of-products expressions (SOP) can share products. The objective is to minimize the total number of different products in the two SOPs. An algorithm for the exact minimization of EX-SOPs with up to five variables are shown. Up to five variables, EX-SOPs for all the representative functions of NP-equivalence classes were minimized. For five-variable functions, we confirmed that minimum EX-SOPs require up to 9 products. For n-variable functions, minimum EX-SOPs require at most 92n-5 (n6) products.

  • On Optimizations of Edge-Valued MDDs for Fast Analysis of Multi-State Systems

    Shinobu NAGAYAMA  Tsutomu SASAO  Jon T. BUTLER  Mitchell A. THORNTON  Theodore W. MANIKAS  

     
    PAPER-Logic Design

      Vol:
    E97-D No:9
      Page(s):
    2234-2242

    In the optimization of decision diagrams, variable reordering approaches are often used to minimize the number of nodes. However, such approaches are less effective for analysis of multi-state systems given by monotone structure functions. Thus, in this paper, we propose algorithms to minimize the number of edges in an edge-valued multi-valued decision diagram (EVMDD) for fast analysis of multi-state systems. The proposed algorithms minimize the number of edges by grouping multi-valued variables into larger-valued variables. By grouping multi-valued variables, we can reduce the number of nodes as well. To show the effectiveness of the proposed algorithms, we compare the proposed algorithms with conventional optimization algorithms based on a variable reordering approach. Experimental results show that the proposed algorithms reduce the number of edges by up to 15% and the number of nodes by up to 47%, compared to the conventional ones. This results in a speed-up of the analysis of multi-state systems by about three times.

  • Head-Tail Expressions for Interval Functions

    Infall SYAFALNI  Tsutomu SASAO  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E97-A No:10
      Page(s):
    2043-2054

    This paper shows a method to represent interval functions by using head-tail expressions. The head-tail expressions represent greater-than GT(X:A) functions, less-than LT(X:B) functions, and interval functions IN0(X:A,B) more efficiently than sum-of-products expressions. Let n be the number of bits to represent the largest value in the interval (A,B). This paper proves that a head-tail expression (HT) represents an interval function with at most n words in a ternary content addressable memory (TCAM) realization. It also shows the average numbers of factors to represent interval functions by HTs for up to n=16, which were obtained by a computer simulation. It also conjectures that, for sufficiently large n, the average number of factors to represent n-variable interval functions by HTs is at most 2/3n-5/9. Experimental results also show that, for n≥10, to represent interval functions, HTs require at least 20% fewer factors than MSOPs, on the average.

  • Design Method for Numerical Function Generators Using Recursive Segmentation and EVBDDs

    Shinobu NAGAYAMA  Tsutomu SASAO  Jon T. BUTLER  

     
    PAPER-Logic Synthesis and Verification

      Vol:
    E90-A No:12
      Page(s):
    2752-2761

    Numerical function generators (NFGs) realize arithmetic functions, such as ex,sin(πx), and , in hardware. They are used in applications where high-speed is essential, such as in digital signal or graphics applications. We introduce the edge-valued binary decision diagram (EVBDD) as a means of reducing the delay and memory requirements in NFGs. We also introduce a recursive segmentation algorithm, which divides the domain of the function to be realized into segments, where the given function is realized as a polynomial. This design reduces the size of the multiplier needed and thus reduces delay. It is also shown that an adder can be replaced by a set of 2-input AND gates, further reducing delay. We compare our results to NFGs designed with multi-terminal BDDs (MTBDDs). We show that EVBDDs yield a design that has, on the average, only 39% of the memory and 58% of the delay of NFGs designed using MTBDDs.

21-40hit(42hit)